
feat(llmobs): auto-tag git commit and repository URL on traces and ex…#17939

Open
gsvigruha wants to merge 9 commits into main from gergely.svigruha/capture-git-commit

@gsvigruha gsvigruha commented May 6, 2026

Description

Auto-detect git variables and add them as tags.

  • For both prod and experiment tracing, try to detect git variables and attach them as tags:
    • Use the default dd-trace mechanism.
    • Fall back to detecting these variables with `git rev-parse` (in a typical dev env they are not set as env vars, but we can spawn a git process).
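
The lookup order described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `detect_commit_sha` and `_run_git` are hypothetical names, and the sketch only covers the `DD_GIT_COMMIT_SHA` env var, whereas the real dd-trace mechanism also consults other sources.

```python
import os
import subprocess
from typing import Callable, Mapping, Optional


def _run_git(*args: str) -> str:
    """Run a git command and return stripped stdout, or "" on any failure."""
    try:
        result = subprocess.run(
            ["git", *args],
            capture_output=True,
            text=True,
            timeout=5,
            check=False,
        )
    except (OSError, subprocess.SubprocessError):
        # git is missing, not on PATH, or the call timed out.
        return ""
    return result.stdout.strip() if result.returncode == 0 else ""


def detect_commit_sha(
    env: Optional[Mapping[str, str]] = None,
    run_git: Callable[..., str] = _run_git,
) -> str:
    """Hypothetical helper: prefer the env var, otherwise ask the git CLI."""
    env = os.environ if env is None else env
    # The `or` only reaches the git call when the env var is unset or blank.
    return env.get("DD_GIT_COMMIT_SHA", "").strip() or run_git("rev-parse", "HEAD")
```

Passing `env` and `run_git` as parameters is a choice made here so that the order can be exercised without a real repository or a git binary.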

Testing

  • Build wheel locally
  • Run the experiment onboarding notebook with new wheel
  • Verify tags

Risks

Performance: both mechanisms compute the git variables once and cache them for the lifetime of the process, so this should not be an issue.
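
As a rough illustration of compute-once caching for the process lifetime, the sketch below memoizes a zero-argument resolver with `functools.lru_cache`. The resolver name, the counter, and the placeholder values are assumptions for the example; the PR may implement its cache differently.

```python
import functools
from typing import Tuple

# Counts how often the expensive lookup actually runs (for illustration only).
lookup_calls = 0


@functools.lru_cache(maxsize=None)
def resolve_git_metadata_once() -> Tuple[str, str]:
    """Hypothetical resolver: returns (repository_url, commit_sha)."""
    global lookup_calls
    lookup_calls += 1
    # Placeholder values; a real resolver would read env vars or call git here.
    return ("https://example.com/org/repo.git", "0123abc")


first = resolve_git_metadata_once()
second = resolve_git_metadata_once()  # served from the cache, no second lookup
```

The function body runs only on the first call, so the cost of any git subprocess would be paid at most once per process.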

…periments

Reuses ddtrace.internal.gitmetadata.get_git_tags() so LLM Observability spans
and Experiment._tags pick up git.commit.sha and git.repository_url from the
existing DD_GIT_* env vars or the main package's Project-URL metadata, gated
by DD_TRACE_GIT_METADATA_ENABLED. User-supplied experiment tags with the same
keys take precedence.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
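
The precedence rule in this commit message (user-supplied experiment tags win over auto-detected ones) could look like the sketch below. `merge_experiment_tags` is a hypothetical helper, and skipping empty git values is an assumption rather than confirmed behavior.

```python
from typing import Dict


def merge_experiment_tags(git_tags: Dict[str, str], user_tags: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical helper: start from detected git tags, let user tags override."""
    # Assumption: empty git values are dropped rather than emitted as blank tags.
    merged = {key: value for key, value in git_tags.items() if value}
    # update() overwrites any key the user also supplied, so the user value wins.
    merged.update(user_tags)
    return merged


detected = {"git.commit.sha": "0123abc", "git.repository_url": ""}
supplied = {"git.commit.sha": "my-pinned-sha", "team": "ml"}
tags = merge_experiment_tags(detected, supplied)
```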
@gsvigruha gsvigruha requested review from a team as code owners May 6, 2026 23:53

cit-pr-commenter-54b7da Bot commented May 6, 2026

Codeowners resolved as

releasenotes/notes/llmobs-git-metadata-tags-c59fc6c93e773536.yaml       @DataDog/apm-python


datadog-prod-us1-5 Bot commented May 7, 2026

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: c32a37f

gsvigruha and others added 2 commits May 6, 2026 20:28
Adds an experiment-only fallback in _resolve_experiment_git_metadata that
shells out to `git rev-parse HEAD` / `git ls-remote --get-url` when the
standard gitmetadata source (DD_GIT_* env vars / package Project-URL) is
empty. Cached for the process lifetime; URL is scrubbed via
_filter_sensitive_info. Prod LLM Obs span tagging is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
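
To illustrate why the remote URL is scrubbed before tagging, here is a small stand-in that strips embedded credentials from a URL. It is not the implementation of `_filter_sensitive_info`; `scrub_repository_url` is a hypothetical name and the logic is only an approximation of that idea.

```python
from urllib.parse import urlsplit, urlunsplit


def scrub_repository_url(url: str) -> str:
    """Hypothetical stand-in: drop a `user:password@` prefix from the host part."""
    parts = urlsplit(url)
    if "@" not in parts.netloc:
        # Nothing to strip; scp-style remotes (git@host:org/repo.git) have no
        # scheme, so urlsplit leaves netloc empty and they pass through unchanged.
        return url
    # Keep only what follows the last "@" (the host and optional port).
    netloc = parts.netloc.rpartition("@")[2]
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

For example, a remote configured with a personal access token in the URL would otherwise leak that token into every tagged trace or experiment.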
Test the resolver _resolve_experiment_git_metadata directly instead of
through Experiment(...) for fallback scenarios; parametrize the three
gitmetadata-vs-fallback combinations; mock the resolver in the two
Experiment integration tests so they no longer reach down into
gitmetadata + extract_*. Same coverage, ~50 fewer lines.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread ddtrace/llmobs/_experiment.py Outdated
not repeatedly invoke ``git``.
"""
global _GIT_FALLBACK_CACHE
repository_url, commit_sha, _ = gitmetadata.get_git_tags()
Contributor Author

This is not sufficient in a typical dev env.


@Yun-Kim Yun-Kim left a comment

Not sure

Comment thread ddtrace/llmobs/_experiment.py Outdated
Comment thread ddtrace/llmobs/_experiment.py Outdated
gsvigruha and others added 6 commits May 11, 2026 14:44
Move the gitmetadata + git-CLI fallback chain into _utils.resolve_llmobs_git_metadata.
Resolve once at LLMObs.enable() and store the pair on LLMObs class attrs;
_process_llm_span reads from those attrs instead of recomputing per span.
Experiment.__init__ reads the same values from LLMObs when enabled, so
constructing many experiments no longer re-shells-out to git. Drops the
module-level _GIT_FALLBACK_CACHE global in favor of singleton-scoped state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
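
The resolve-once-at-enable pattern from this commit can be sketched as below. `ServiceSketch` is a hypothetical stand-in for a singleton-style service such as LLMObs, and the attribute and method names are invented for the example.

```python
from typing import Callable, Dict, Tuple


class ServiceSketch:
    """Hypothetical singleton-style service that stores git metadata on class attrs."""

    enabled = False
    _git_repository_url = ""
    _git_commit_sha = ""

    @classmethod
    def enable(cls, resolver: Callable[[], Tuple[str, str]]) -> None:
        """Run the (possibly slow) resolver exactly once, when the service starts."""
        if cls.enabled:
            return
        cls._git_repository_url, cls._git_commit_sha = resolver()
        cls.enabled = True

    @classmethod
    def tag_span(cls, span_tags: Dict[str, str]) -> Dict[str, str]:
        """Per-span path: read the stored values and never call the resolver."""
        if cls._git_commit_sha:
            span_tags["git.commit.sha"] = cls._git_commit_sha
        if cls._git_repository_url:
            span_tags["git.repository_url"] = cls._git_repository_url
        return span_tags
```

Keeping the values on the class avoids a module-level global while still letting every span read the same pair cheaply.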
resolve_llmobs_git_metadata short-circuits to ("", "") when
gitmetadata.config.enabled is False, so the experiment shellout fallback
respects the same disable flag as the standard gitmetadata path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
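
A minimal sketch of the short-circuit described in this commit, assuming the resolver returns a `(repository_url, commit_sha)` pair. The function name, its parameters, and the per-field fallback are assumptions made for illustration; the real `resolve_llmobs_git_metadata` may differ.

```python
from typing import Callable, Tuple

GitPair = Tuple[str, str]  # (repository_url, commit_sha)


def resolve_git_metadata_sketch(
    enabled: bool,
    primary: Callable[[], GitPair],
    fallback: Callable[[], GitPair],
) -> GitPair:
    """Hypothetical resolver that honors a single disable flag for both sources."""
    if not enabled:
        # Short-circuit: neither the standard source nor the git CLI is consulted.
        return ("", "")
    repository_url, commit_sha = primary()
    if repository_url and commit_sha:
        return (repository_url, commit_sha)
    # Assumption: fill only the missing fields from the fallback source.
    fallback_url, fallback_sha = fallback()
    return (repository_url or fallback_url, commit_sha or fallback_sha)
```

Checking the flag before anything else is what keeps a disabled configuration from spawning a git process at all.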
…nit__

The helper was a 4-line wrapper with one caller. Inlining drops a layer
of indirection and a function-scoped LLMObs import. Tests are reworked
to mock LLMObs class attrs directly rather than the now-removed helper,
and a fallback-when-disabled test covers the distributed-bootstrap path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drops the LLMObs class-attr shortcut from Experiment construction. Each
Experiment construction now goes through resolve_llmobs_git_metadata
directly; the per-Experiment shellout cost (~10-50ms when env vars are
empty) is negligible against typical experiment run times. Also trims
the resolver docstring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The note still described the old experiment-only shellout fallback and
the process-lifetime cache. The fallback now applies to spans too and
experiments resolve per-construction, so the wording needed an update.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2 participants